Scene Structure Inference through Scene Map Estimation
Abstract
Understanding indoor scene structure from a single RGB image is useful for a wide variety of applications ranging from the editing of scenes to the mining of statistics about space utilization. Most efforts in scene understanding focus on extraction of either dense information such as pixel-level depth or semantic labels, or very sparse information such as bounding boxes obtained through object detection. In this paper we propose the concept of a scene map, a coarse scene representation, which describes the locations of the objects present in the scene from a top-down view (i.e., as they are positioned on the floor), as well as a pipeline to extract such a map from a single RGB image. To this end, we use a synthetic rendering pipeline, which supplies an adapted CNN with virtually unlimited training data. We quantitatively evaluate our results, showing that we clearly outperform a dense baseline approach, and argue that scene maps provide a useful representation for abstract indoor scene understanding.
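To make the idea of a coarse top-down representation concrete, the following is a minimal illustrative sketch, not the authors' pipeline. It only rasterizes known object floor positions into a small occupancy grid. The function name `build_scene_map`, the room size, the grid resolution, and the binary cell encoding are all assumptions made for illustration; the abstract does not specify how the scene map is encoded or at what resolution.

```python
def build_scene_map(objects, room_size=(4.0, 4.0), grid=(8, 8)):
    """Rasterize object floor positions into a coarse top-down grid.

    objects   -- list of (x, z) floor coordinates in metres (hypothetical input)
    room_size -- (width, depth) of the floor area in metres (assumed)
    grid      -- (rows, cols) of the coarse map (assumed resolution)
    """
    width, depth = room_size
    rows, cols = grid
    scene_map = [[0] * cols for _ in range(rows)]
    for x, z in objects:
        # Ignore objects that fall outside the assumed floor area.
        if not (0.0 <= x < width and 0.0 <= z < depth):
            continue
        col = int(x / width * cols)
        row = int(z / depth * rows)
        scene_map[row][col] = 1  # mark the cell as occupied
    return scene_map


if __name__ == "__main__":
    # Two objects inside the room, one outside (skipped).
    demo = build_scene_map([(0.5, 0.5), (3.9, 2.0), (5.0, 1.0)])
    for line in demo:
        print("".join("#" if cell else "." for cell in line))
```

In the paper's setting such a map would be predicted by a CNN from a single RGB image rather than computed from known coordinates; the sketch only shows what the target representation might look like.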
Similar resources
An Improved Motion Vector Estimation Approach for Video Error Concealment Based on the Video Scene Analysis
To improve the accuracy of motion vector (MV) estimation and reduce error propagation during the estimation, this paper proposes a new adaptive error concealment (EC) approach based on information extracted from the video scene. The motion information of the video scene around the degraded MB is first analyzed to estimate the motion type of...
Three dimensional scene reconstruction from passive imagery
This paper describes a new approach, developed under the auspices of the Electro Magnetic Remote Sensing Defence Technology Centre, for reconstruction of three dimensional scene content using passive video. This gradient based scheme uses the parallax in an image sequence to estimate the distance from sensor to scene. Intensity measurements from a passive camera are combined with a rigid body m...
Guaranteed Parameter Estimation of Discrete Energy Minimization for 3D Scene Parsing
Point cloud data, obtained from RGB-D cameras and laser scanners or constructed through structure from motion (SfM), are becoming increasingly popular in the field of robotics perception. To allow efficient robot interaction, we require not only the local appearance and geometry, but also a higher level understanding of the scene. Such semantic representation is also necessary for as-built B...
Combining Monocular Geometric Cues with Traditional Stereo Cues for Consumer Camera Stereo
This paper presents a framework for considering both stereo cues and structural priors to obtain a geometrically representative depth map from a narrow baseline stereo pair. We use stereo pairs captured with a consumer stereo camera and observe that traditional depth estimation using stereo matching techniques encounters difficulties related to the narrow baseline relative to the depth of the s...
Robust Video Mosaicing through Topology Inference and Local to Global Alignment
The problem of piecing together individual frames in a video sequence to create seamless panoramas (video mosaics) has attracted increasing attention in recent times. One challenge in this domain has been to rapidly and automatically create high quality seamless mosaics using inexpensive cameras and relatively free hand motions. In order to capture a wide angle scene using a video sequence of r...
Publication date: 2016